Abstract. For fixed $m > 1$, we consider $m$ independent $n \times n$ non-Hermitian
random matrices $X_1, \ldots, X_m$ with i.i.d. centered entries with a finite
$(2+\eta)$-th moment, $\eta > 0$. As $n$ tends to infinity, we show that the empirical
spectral distribution of $n^{-m/2} X_1 X_2 \cdots X_m$ converges, with probability 1, to
a non-random, rotationally invariant distribution with compact support in the
complex plane. The limiting distribution is the $m$-th power of the circular law.



1. Introduction and Formulation of Results 

Many important results in random matrix theory pertain to Hermitian random 
matrices. Two powerful tools used in this area are the moment method and the 
Stieltjes transform. Unfortunately, these two techniques are not suitable for dealing
with non-Hermitian random matrices [6].

1.1. The Circular Law. One of the fundamental results in the study of non- 
Hermitian random matrices is the circular law. We begin by defining the empirical 
spectral distribution (ESD). 

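Definition 1. Let $X$ be a matrix of order $n$ and let $\lambda_1, \ldots, \lambda_n$ be the eigenvalues
of $X$. Then the empirical spectral distribution (ESD) $\mu_X$ of $X$ is defined as

\[ \mu_X(z, \bar z) = \frac{1}{n} \#\{k \le n : \operatorname{Re}(\lambda_k) \le \operatorname{Re}(z);\ \operatorname{Im}(\lambda_k) \le \operatorname{Im}(z)\}. \]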
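As an illustration of Definition 1, the following minimal sketch evaluates the ESD at a point from a given list of eigenvalues. The function name `esd` and the choice to pass the eigenvalues in directly (rather than compute them from a matrix) are assumptions made for this example only.

```python
def esd(eigenvalues, z):
    """Evaluate the ESD of Definition 1 at the complex point z.

    Returns the fraction of eigenvalues whose real part is at most Re(z)
    and whose imaginary part is at most Im(z).
    """
    z = complex(z)
    count = 0
    for lam in eigenvalues:
        lam = complex(lam)
        if lam.real <= z.real and lam.imag <= z.imag:
            count += 1
    return count / len(eigenvalues)
```

For instance, `esd([1 + 1j, -1 - 1j, 0.5 + 2j], 1 + 1j)` counts the first two eigenvalues and not the third.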

Let $\xi$ be a complex random variable with finite non-zero variance $\sigma^2$ and let $N_n$
be a random matrix of order $n$ with entries being i.i.d. copies of $\xi$. We say that the
circular law holds for $\xi$ if, with probability 1, the ESD $\mu_{\frac{1}{\sigma\sqrt{n}} N_n}$ of $\frac{1}{\sigma\sqrt{n}} N_n$ converges
(uniformly) to the uniform distribution over the unit disk as $n$ tends to infinity.

The circular law was conjectured in the 1950's as a non-Hermitian counterpart 
to Wigner's semi-circle law. The circular law was first shown by Mehta in 1967 [22] 
when $\xi$ is complex Gaussian. Mehta relied upon the joint density of the eigenvalues,
which was discovered by Ginibre [10] two years earlier.

Building on the work of Girko [11], Bai proved the circular law under the conditions
that $\xi$ has finite sixth moment and that the joint distribution of the real and
imaginary parts of $\xi$ has bounded density [3]. In [6], the sixth moment assumption
was weakened to $\mathbb{E}|\xi|^{2+\eta} < \infty$ for any specified $\eta > 0$, but the bounded density
assumption still remained. Götze and Tikhomirov [15] proved the circular law in the case
of i.i.d. sub-Gaussian matrix entries. Pan and Zhou proved the circular law for any
distribution $\xi$ with finite fourth moment [25] by building on [15] and utilizing the
work of Rudelson and Vershynin in [27]. In an important development, Götze and
Tikhomirov showed in [14] that the expected spectral distribution $\mathbb{E}\mu_{\frac{1}{\sqrt{n}} N_n}$ converges
to the uniform distribution over the unit disk as $n$ tends to infinity assuming that
$\sup_{jk} \mathbb{E}|(N_n)_{jk}|^2 \varphi((N_n)_{jk}) < \infty$, where $\varphi(x) = (\ln(1 + |x|))^{19+\eta}$, $\eta > 0$. In [28],
Tao and Vu proved the circular law assuming a bounded $(2+\eta)$-th moment, for
any fixed $\eta > 0$. Finally, Tao and Vu have been able to remove the extra $\eta$ in the
moment condition. Namely, they proved the circular law in [29] assuming only that
the second moment is bounded.



1.2. Main Results. In this paper, we study the ESD of the product

\[ X^{(n)} = X_1^{(n)} X_2^{(n)} \cdots X_m^{(n)} \]

of $m$ independent $n \times n$ non-Hermitian random matrices as $n$ tends to infinity.
Burda, Janik, and Waclaw [8] studied the mathematical expectation of the limiting
ESD, $\lim_{n \to \infty} \mathbb{E}\mu_{X^{(n)}}$, in the case that the entries of the matrices are Gaussian.
Here we extend their results by proving the almost sure convergence of the ESD,
$\mu_{X^{(n)}}$, for a class of non-Gaussian random matrices. Namely, we require that the
entries of $X_i^{(n)}$, $i = 1, \ldots, m$, are i.i.d. random variables with a finite moment of
order $2 + \eta$, $\eta > 0$.

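Theorem 2. Fix $m > 1$ and let $\xi$ be a complex random variable with variance 1
such that $\operatorname{Re}(\xi)$ and $\operatorname{Im}(\xi)$ are independent, each with mean zero, and $\mathbb{E}|\xi|^{2+\eta} < \infty$
for some $\eta > 0$. Let $X_1^{(n)}, \ldots, X_m^{(n)}$ be independent random matrices of order $n$
where the entries of $X_j^{(n)}$ are i.i.d. copies of $\sigma_j \xi / \sqrt{n}$ for some collection of positive
constants $\sigma_1, \ldots, \sigma_m$. Then the ESD $\mu_{X^{(n)}}$ of $X^{(n)} = X_1^{(n)} X_2^{(n)} \cdots X_m^{(n)}$ converges,
with probability 1, as $n \to \infty$ to the distribution whose density is given by

\[ \rho(z, \bar z) = \begin{cases} \dfrac{1}{m\pi}\, \sigma^{-\frac{2}{m}} |z|^{\frac{2}{m} - 2} & \text{for } |z| \le \sigma, \\ 0 & \text{for } |z| > \sigma, \end{cases} \tag{1} \]

where $\sigma = \sigma_1 \cdots \sigma_m$.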
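To make the limiting density of Theorem 2 concrete, here is a small sketch that evaluates it and the mass it assigns to a centered disk. The function names `limit_density` and `radial_mass` are illustrative and not taken from the paper; the closed form $(r/\sigma)^{2/m}$ for the mass of the disk of radius $r \le \sigma$ follows by integrating the density in polar coordinates.

```python
import math


def limit_density(z, m, sigma=1.0):
    """Density from Theorem 2 at the complex point z (sigma is the product of the sigma_j)."""
    r = abs(z)
    if r > sigma:
        return 0.0
    if r == 0.0:
        # The density is finite at the origin only for m = 1 (circular law).
        return 1.0 / (math.pi * sigma ** 2) if m == 1 else math.inf
    return sigma ** (-2.0 / m) * r ** (2.0 / m - 2.0) / (m * math.pi)


def radial_mass(r, m, sigma=1.0):
    """Mass assigned to the disk {|z| <= r}: (r / sigma)^(2/m), capped at 1."""
    if r <= 0.0:
        return 0.0
    return min(r / sigma, 1.0) ** (2.0 / m)
```

Since `radial_mass(sigma, m, sigma)` equals 1, the density integrates to one, and for $m = 1$ the function reduces to the uniform density on the disk of radius $\sigma$.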



Remark 3. The almost sure convergence of $\mu_{X^{(n)}}$ implies the convergence of $\mathbb{E}\mu_{X^{(n)}}$
as well.

Remark 4. We refer the reader to [1] for bounds on powers of a square random 
matrix with i.i.d. entries. See also [TJ, [2], [5], [S], [7], and [53] for some other results 
on the spectral properties of products of random matrices. 



2. Notation and Setup 

The proof of Theorem 2 is divided into two parts and presented in Sections 3
and 4.

We note that without loss of generality, we may assume $\sigma_1 = \sigma_2 = \cdots = \sigma_m = 1$.
Indeed, the spectrum for arbitrary $\sigma_1, \ldots, \sigma_m$ can be obtained by a trivial rescaling.
Following Burda, Janik, and Waclaw in [8], we let $Y^{(n)}$ be a $(mn) \times (mn)$ matrix
defined as

\[ Y^{(n)} = \begin{pmatrix} 0 & X_1^{(n)} & & & 0 \\ 0 & 0 & X_2^{(n)} & & 0 \\ & & \ddots & \ddots & \\ 0 & & & 0 & X_{m-1}^{(n)} \\ X_m^{(n)} & & & & 0 \end{pmatrix}. \tag{2} \]

Section 4 will be devoted to proving that the ESD of $Y^{(n)}$ obeys the circular law
as $n$ tends to infinity. This statement is presented in the following lemma.

Lemma 5 ($Y^{(n)}$ obeys the circular law). The ESD $\mu_{Y^{(n)}}$ of $Y^{(n)}$ converges, with
probability 1, to the uniform distribution over the unit disk as $n \to \infty$.

3. Proof of Theorem 2
With Lemma 5 above, we are ready to prove Theorem 2.

Proof of Theorem 2. Using the definition of $Y^{(n)}$ in (2), we can compute

\[ \left(Y^{(n)}\right)^m = \begin{pmatrix} Y_1 & & 0 \\ & \ddots & \\ 0 & & Y_m \end{pmatrix}, \]

where $Y_k = X_k^{(n)} X_{k+1}^{(n)} \cdots X_m^{(n)} X_1^{(n)} \cdots X_{k-1}^{(n)}$ for $1 \le k \le m$. Notice that each $Y_k$
has the same eigenvalues as $X^{(n)}$. Let $\lambda_1, \ldots, \lambda_n$ denote the eigenvalues of $X^{(n)}$
and let $\eta_1, \ldots, \eta_{mn}$ denote the eigenvalues of $Y^{(n)}$. Then it follows that each $\lambda_k$ is
an eigenvalue of $(Y^{(n)})^m$ with multiplicity $m$.
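The block structure used here can be checked on a small example. The sketch below is an illustration only: it builds the block-cyclic matrix with blocks $X_1, \ldots, X_m$ in positions $(a, a+1)$ (indices taken modulo $m$), raises it to the $m$-th power with plain nested lists, and lets one compare each diagonal block with the corresponding cyclic product. All function names are hypothetical, and blocks are indexed from 0.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [[sum(row[t] * b[t][j] for t in range(inner)) for j in range(cols)]
            for row in a]


def block_cyclic(blocks):
    """Place blocks[a] in block position (a, a + 1 mod m); all other blocks are zero."""
    m, n = len(blocks), len(blocks[0])
    size = m * n
    y = [[0] * size for _ in range(size)]
    for a, x in enumerate(blocks):
        b = (a + 1) % m
        for i in range(n):
            for j in range(n):
                y[a * n + i][b * n + j] = x[i][j]
    return y


def matrix_power(a, p):
    """a raised to the integer power p >= 1 by repeated multiplication."""
    result = a
    for _ in range(p - 1):
        result = matmul(result, a)
    return result


def cyclic_product(blocks, k):
    """blocks[k] * blocks[k+1] * ... * blocks[k-1], indices taken modulo m."""
    m = len(blocks)
    product = blocks[k]
    for step in range(1, m):
        product = matmul(product, blocks[(k + step) % m])
    return product


def block(y, a, b, n):
    """Extract the n x n block of y in block position (a, b)."""
    return [row[b * n:(b + 1) * n] for row in y[a * n:(a + 1) * n]]
```

With integer blocks the comparison is exact: for three $2 \times 2$ blocks, `block(matrix_power(block_cyclic(blocks), 3), k, k, 2)` equals `cyclic_product(blocks, k)` for each `k`, and the off-diagonal blocks vanish.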

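Let $f : \mathbb{C} \to \mathbb{C}$ be a continuous, bounded function. Then we have

\[ \int f(z)\, d\mu_{X^{(n)}}(z, \bar z) = \frac{1}{n} \sum_{k=1}^{n} f(\lambda_k) = \frac{1}{mn} \sum_{k=1}^{mn} f(\eta_k^m) = \int f(z^m)\, d\mu_{Y^{(n)}}(z, \bar z). \]

By Lemma 5,

\[ \int f(z^m)\, d\mu_{Y^{(n)}}(z, \bar z) \to \frac{1}{\pi} \int_D f(z^m)\, dz\, d\bar z \]

as $n \to \infty$, where $D$ denotes the unit disk in the complex plane. Thus, by the change
of variables $z \mapsto z^m$ and $\bar z \mapsto \bar z^m$ we can write

\[ \frac{1}{\pi} \int_D f(z^m)\, dz\, d\bar z = \frac{m}{\pi} \int_D f(z)\, \frac{1}{m^2} |z|^{\frac{2}{m} - 2}\, dz\, d\bar z = \frac{1}{m\pi} \int_D f(z) |z|^{\frac{2}{m} - 2}\, dz\, d\bar z, \]

where the factor of $m$ out front of the integral corresponds to the fact that the
transformation maps the complex plane $m$ times onto itself.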
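As a sanity check of this change of variables (an illustration, not part of the original argument), take $f(z) = \min(|z|^2, 1)$, which is continuous, bounded, and equal to $|z|^2$ on $D$. Writing both sides in polar coordinates, with $dz\, d\bar z$ read as the area element, gives the same value:

```latex
\[
\frac{1}{\pi}\int_D |z^m|^2 \, dz\, d\bar z
  = \int_0^1 r^{2m}\, 2r \, dr
  = \frac{1}{m+1},
\qquad
\frac{1}{m\pi}\int_D |z|^2\, |z|^{\frac{2}{m}-2} \, dz\, d\bar z
  = \frac{2}{m}\int_0^1 r^{\frac{2}{m}+1} \, dr
  = \frac{1}{m+1}.
\]
```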

Therefore, we have shown that for all continuous, bounded functions $f$,

\[ \int f(z)\, d\mu_{X^{(n)}}(z, \bar z) \to \frac{1}{m\pi} \int_D f(z) |z|^{\frac{2}{m} - 2}\, dz\, d\bar z \]

as $n \to \infty$ and the proof is complete. □
4. Proof of Lemma 5

In order to prove that the ESD of $Y^{(n)}$ obeys the circular law, we follow the
work of Bai in [3], Bai and Silverstein in [5], and use the results developed by Tao
and Vu in [28]. To do so, we introduce the following notation. Let $\mu_n$ denote the
ESD of $Y^{(n)}$. That is,

\[ \mu_n(x, y) = \frac{1}{mn} \#\{k \le mn : \operatorname{Re}(\lambda_k) \le x;\ \operatorname{Im}(\lambda_k) \le y\}, \]

where $\lambda_1, \ldots, \lambda_{mn}$ are the eigenvalues of $Y^{(n)}$.

An important idea in the proof is to analyze the Stieltjes transform
$s_n : \mathbb{C} \to \mathbb{C}$ of $\mu_n$ defined by

\[ s_n(z) = \frac{1}{mn} \sum_{k=1}^{mn} \frac{1}{\lambda_k - z} = \iint \frac{1}{x + iy - z}\, d\mu_n(x, y). \]

Since $s_n(z)$ is analytic everywhere except the poles, the real part determines the
eigenvalues. Let $z = s + it$. Then we can write

\[ \operatorname{Re}(s_n(z)) = \frac{1}{mn} \sum_{k=1}^{mn} \frac{\operatorname{Re}(\lambda_k) - s}{|\lambda_k - z|^2} = -\frac{1}{2mn} \sum_{k=1}^{mn} \frac{\partial}{\partial s} \ln|\lambda_k - z|^2 = -\frac{1}{2} \frac{\partial}{\partial s} \int_0^\infty \ln x\, \nu_n(dx, z), \]

where $\nu_n(\cdot, z)$ is the ESD of the Hermitian matrix $H_n = (Y^{(n)} - zI)^*(Y^{(n)} - zI)$.
This reduces the task to controlling the distributions $\nu_n$.
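The identity behind this reduction, $\operatorname{Re}\frac{1}{\lambda - z} = -\frac{1}{2}\frac{\partial}{\partial s}\ln|\lambda - z|^2$ with $z = s + it$, can be checked numerically for a single eigenvalue. The sketch below uses a central finite difference in $s$; the function names and the step size `h` are assumptions made for this illustration.

```python
import math


def re_resolvent(lam, z):
    """Real part of 1 / (lam - z)."""
    return (1.0 / (lam - z)).real


def log_modulus_s_derivative(lam, z, h=1e-6):
    """Central finite-difference approximation of d/ds ln|lam - z|^2, where z = s + it."""
    def log_mod_sq(s):
        return math.log(abs(lam - complex(s, z.imag)) ** 2)

    return (log_mod_sq(z.real + h) - log_mod_sq(z.real - h)) / (2.0 * h)
```

For any `lam != z`, `re_resolvent(lam, z)` should agree with `-0.5 * log_modulus_s_derivative(lam, z)` up to the finite-difference error.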

The main difficulties arise from the two poles of the log function, at $\infty$ and 0.
We will need to use the bounds developed in [3] and [35] to control the largest
singular value and the least singular value of $Y^{(n)} - zI$.

A version of the following lemma was first presented by Girko [11]. We present
a slightly refined version by Bai and Silverstein [6].

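Lemma 6. For any $uv \ne 0$ we have

\[ c_n(u, v) = \iint e^{iux + ivy}\, \mu_n(dx, dy) = \frac{u^2 + v^2}{4iu\pi} \iint \frac{\partial}{\partial s} \left[ \int_0^\infty \ln x\, \nu_n(dx, z) \right] e^{ius + ivt}\, dt\, ds, \tag{3} \]

where $z = s + it$.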



We note that the singular values of $Y^{(n)}$ are the union of the singular values
of $X_k^{(n)}$ for $1 \le k \le m$. Thus, under the assumptions of Theorem 2, the ESD of
$Y^{(n)*} Y^{(n)}$ converges to the Marchenko-Pastur law (see [3D] and [6, Theorem 3.7]).
Thus by Lemma 8 it follows that, with probability 1, the family of distributions
$\mu_n$ is tight. To prove the circular law we will show that the right-hand side of (3)
converges to $c(u, v)$, its counterpart generated by the circular law, for all $uv \ne 0$.
Several steps of the proof will follow closely the work of Bai in [3] and Bai and
Silverstein in [6]. We present an outline of the proof as follows.
(1) We reduce the range of integration to a finite rectangle in Section 4.2. We
will show that the proof reduces to showing that, for every large $A > 0$ and
small $\varepsilon > 0$,

\[ \iint_T \left[ \frac{\partial}{\partial s} \int_0^\infty \ln x\, \nu_n(dx, z) \right] e^{ius + ivt}\, ds\, dt \to \iint_T \left[ \frac{\partial}{\partial s} \int_0^\infty \ln x\, \nu(dx, z) \right] e^{ius + ivt}\, ds\, dt, \]

where $T = \{(s,t) : |s| \le A,\ |t| \le A^3,\ |\sqrt{s^2 + t^2} - 1| \ge \varepsilon\}$ and $\nu(x, z)$ is
the limiting spectral distribution of the sequence of matrices
$H_n = (Y^{(n)} - zI)^*(Y^{(n)} - zI)$.

(2) We characterize the limiting spectrum $\nu(\cdot, z)$ of $\nu_n(\cdot, z)$.

(3) We establish a convergence rate of $\nu_n(\cdot, z)$ to $\nu(\cdot, z)$ uniformly in every
bounded region of $z$.

(4) Finally, we show that for a suitably defined sequence $\varepsilon_n$, with probability 1,

\[ \limsup_{n \to \infty} \iint_T \left| \int_{\varepsilon_n}^\infty \ln x\, \big(\nu_n(dx, z) - \nu(dx, z)\big) \right| ds\, dt = 0 \]

and

\[ \lim_{n \to \infty} \iint_T \left| \int_0^{\varepsilon_n} \ln x\, \nu_n(dx, z) \right| ds\, dt = 0. \]

4.1. Notation. In this section, we introduce some notation that we will use throughout
the paper.

First, we will drop the superscript $(n)$ from the matrices $Y^{(n)}, X^{(n)}, X_1^{(n)}, \ldots, X_m^{(n)}$
and simply write $Y, X, X_1, \ldots, X_m$.

We write $R = Y - zI$ where $I$ is the identity matrix and $z = s + it \in \mathbb{C}$. We
will continue to let $H_n = (Y - zI)^*(Y - zI) = R^*R$ and have $\nu_n(x, z)$ denote the
empirical spectral distribution of $H_n$ for each fixed $z$.

For a $(mn) \times (mn)$ matrix $A$, there are $m^2$ blocks, each consisting of an $n \times n$
matrix. We let $A_{a,b}$ denote the $n \times n$ matrix in position $a, b$ where $1 \le a, b \le m$.
$A_{a,b;i,j}$ then refers to the element $(A_{a,b})_{ij}$ where $1 \le i, j \le n$.

Finally, $C$ will be used as some positive constant that may change from line to
line.

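4.2. Integral Range Reduction. To establish Lemma 5, we need to find the
limiting counterpart to

\[ g_n(s, t) = \frac{\partial}{\partial s} \int_0^\infty \ln x\, \nu_n(dx, z). \]

We begin by presenting the following lemmas.

Lemma 7 (Bai-Silverstein [6]). For all $uv \ne 0$, we have

\[ c(u, v) = \frac{1}{\pi} \iint_{x^2 + y^2 \le 1} e^{iux + ivy}\, dx\, dy = \frac{u^2 + v^2}{4iu\pi} \iint g(s, t)\, e^{ius + ivt}\, dt\, ds, \]

where

\[ g(s, t) = \begin{cases} \dfrac{2s}{s^2 + t^2}, & \text{if } s^2 + t^2 > 1, \\ 2s, & \text{otherwise.} \end{cases} \]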
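For reference, the function $g(s,t)$ of Lemma 7 is simple to evaluate; the sketch below is a direct transcription, with the function name `g` chosen to mirror the notation. Note that the two branches agree on the unit circle $s^2 + t^2 = 1$, so $g$ is continuous.

```python
def g(s, t):
    """The function g(s, t) from Lemma 7: 2s / (s^2 + t^2) outside the unit disk, 2s inside."""
    r2 = s * s + t * t
    if r2 > 1.0:
        return 2.0 * s / r2
    return 2.0 * s
```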
Lemma 8 (Horn-Johnson [17]). Let $\lambda_j$ and $\eta_j$ denote the eigenvalues and singular
values of an $n \times n$ matrix $A$, respectively. Then for any $k \le n$,

\[ \sum_{j=1}^{k} |\lambda_j|^2 \le \sum_{j=1}^{k} \eta_j^2 \]

if $\eta_j$ is arranged in descending order.

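Lemma 9 (Bai-Silverstein [5]). For any $uv \ne 0$ and $A > 2$, we have

\[ \left| \int_{|s| \ge A} \int_{-\infty}^{\infty} g_n(s,t)\, e^{ius + ivt}\, dt\, ds \right| \le \frac{4\pi}{|v|} e^{-\frac{1}{2}|v|A} + \frac{2\pi}{mn|v|} \sum_{k=1}^{mn} I\left(|\lambda_k| \ge \frac{A}{2}\right) \]

and

\[ \left| \int_{|s| \le A} \int_{|t| \ge A^3} g_n(s,t)\, e^{ius + ivt}\, dt\, ds \right| \le \frac{8A}{A^2 - 1} + \frac{4\pi A}{mn} \sum_{k=1}^{mn} I(|\lambda_k| > A), \]

where $\lambda_1, \ldots, \lambda_{mn}$ are the eigenvalues of $Y$. Furthermore, if the function $g_n(s,t)$
is replaced by $g(s,t)$, the two inequalities above hold without the second terms.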

Now we note that under the assumptions of Theorem 2 and by Lemma 8 and
the law of large numbers we have

\[ \frac{1}{mn} \sum_{k=1}^{mn} I(|\lambda_k| \ge A) \le \frac{1}{mnA^2} \operatorname{Tr}(Y^*Y) \to \frac{1}{A^2} \quad \text{a.s.} \]

Therefore, the right-hand sides of the inequalities in Lemma 9 can be made arbitrarily
small by making $A$ large enough. The same is true when $g_n(s,t)$ is replaced
by $g(s,t)$. Our task is then reduced to showing

\[ \iint_{|s| \le A,\ |t| \le A^3} [g_n(s,t) - g(s,t)]\, e^{ius + ivt}\, ds\, dt \to 0. \]

We define the sets

\[ T = \{(s,t) : |s| \le A,\ |t| \le A^3 \text{ and } ||z| - 1| \ge \varepsilon\} \]

and

\[ T_1 = \{(s,t) : ||z| - 1| < \varepsilon\}, \]

where $z = s + it$.

Lemma 10 (Bai-Silverstein [6]). For all fixed $A$ and $0 < \varepsilon < 1$,

\[ \iint_{T_1} |g_n(s,t)|\, ds\, dt \le 32\sqrt{\varepsilon}. \tag{4} \]

Furthermore, if the function $g_n(s,t)$ is replaced by $g(s,t)$, the inequality above holds.

Since the right-hand side of (4) can be made arbitrarily small by choosing $\varepsilon$
small, our task is reduced to showing

\[ \iint_T [g_n(s,t) - g(s,t)]\, e^{ius + ivt}\, ds\, dt \to 0 \quad \text{a.s.} \tag{5} \]

4.3. Characterization of the Circular Law. In this section, we study the convergence
of the distributions $\nu_n(x,z)$ to a limiting distribution $\nu(x,z)$ as well as
discuss properties of the limiting distribution $\nu(x,z)$. We begin with a standard
truncation argument which can be found, for example, in [6].
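4.3.1. Truncation. Let $\hat Y$ and $\tilde Y$ be the $(mn) \times (mn)$ matrices with entries

\[ \hat Y_{a,b;i,j} = Y_{a,b;i,j}\, I(\sqrt{n}|Y_{a,b;i,j}| \le n^\delta) - \mathbb{E}\, Y_{a,b;i,j}\, I(\sqrt{n}|Y_{a,b;i,j}| \le n^\delta) \]

and

\[ \tilde Y_{a,b;i,j} = \frac{\hat Y_{a,b;i,j}}{\sqrt{n\, \mathbb{E}|\hat Y_{1,2;1,1}|^2}}, \]

where $\delta > 0$. We denote the ESD of $\hat H_n = (\hat Y - zI)^*(\hat Y - zI)$ by $\hat\nu_n(\cdot, z)$ and the
ESD of $\tilde H_n = (\tilde Y - zI)^*(\tilde Y - zI)$ by $\tilde\nu_n(\cdot, z)$.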

We will let $L(F_1, F_2)$ be the Lévy distance between two distribution functions
$F_1$ and $F_2$ defined by

\[ L(F_1, F_2) = \inf\{\varepsilon : F_1(x - \varepsilon) - \varepsilon \le F_2(x) \le F_1(x + \varepsilon) + \varepsilon \text{ for all } x \in \mathbb{R}\}. \]

We then have the following lemma.

Lemma 11. We have that

\[ L(\nu_n(\cdot, z), \tilde\nu_n(\cdot, z)) = o(n^{-\eta\delta/4}) \quad \text{a.s.,} \]

where the bound is uniform for $|z| \le M$.

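Proof. By [6, Corollary A.42] we have that

\[ L^4(\nu_n(\cdot, z), \hat\nu_n(\cdot, z)) \le \frac{2}{n^2} \operatorname{Tr}(H_n + \hat H_n)\, \operatorname{Tr}[(Y - \hat Y)^*(Y - \hat Y)]. \]

By the law of large numbers it follows that, with probability 1,

\[ \frac{1}{n} \operatorname{Tr} H_n = \frac{1}{n} \sum_{a=1}^{m} \sum_{1 \le i,j \le n} |Y_{a,a+1;i,j}|^2 + m|z|^2 \to m(1 + |z|^2). \]

Similarly, $\frac{1}{n} \operatorname{Tr}(\hat H_n) \to m(1 + |z|^2)$ a.s.
For any $L > 0$ and all $n$ large enough that $n^\delta > L$, we have

\[ \frac{n^{\eta\delta}}{n} \operatorname{Tr}[(Y - \hat Y)^*(Y - \hat Y)] = \frac{n^{\eta\delta}}{n} \sum_{a=1}^{m} \sum_{1 \le i,j \le n} |Y_{a,a+1;i,j} - \hat Y_{a,a+1;i,j}|^2 \]
\[ \le \frac{2n^{\eta\delta}}{n} \sum_{a=1}^{m} \sum_{1 \le i,j \le n} \Big( |Y_{a,a+1;i,j}|^2 I(\sqrt{n}|Y_{a,a+1;i,j}| > n^\delta) + \big|\mathbb{E}\, Y_{a,a+1;i,j} I(\sqrt{n}|Y_{a,a+1;i,j}| > n^\delta)\big|^2 \Big) \]
\[ \le \frac{2}{n^2} \sum_{a=1}^{m} \sum_{1 \le i,j \le n} |\sqrt{n}\, Y_{a,a+1;i,j}|^{2+\eta} I(\sqrt{n}|Y_{a,a+1;i,j}| > L) + 2m\, \mathbb{E}|\xi|^{2+\eta} I(|\xi| > L), \]

and hence

\[ \limsup_{n \to \infty} \frac{n^{\eta\delta}}{n} \operatorname{Tr}[(Y - \hat Y)^*(Y - \hat Y)] \le 4m\, \mathbb{E}|\xi|^{2+\eta} I(|\xi| > L) \quad \text{a.s.,} \]

which can be made arbitrarily small by making $L$ large. Thus we have that

\[ L(\nu_n(\cdot, z), \hat\nu_n(\cdot, z)) = o(n^{-\eta\delta/4}) \quad \text{a.s.,} \]

where the bound is uniform for $|z| \le M$.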
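By [6, Corollary A.42] we also have that

\[ L^4(\hat\nu_n(\cdot, z), \tilde\nu_n(\cdot, z)) \le \frac{2}{n^2} \operatorname{Tr}(\hat H_n + \tilde H_n)\, \operatorname{Tr}(\hat Y^* \hat Y) \left| 1 - \frac{1}{\sqrt{\mathbb{E}|\sqrt{n}\, \hat Y_{1,2;1,1}|^2}} \right|^2. \]

A similar argument shows that $1 - \sqrt{\mathbb{E}|\sqrt{n}\, \hat Y_{1,2;1,1}|^2} = o(n^{-\eta\delta})$ and the proof is
complete. □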

Remark 12. For the remainder of the subsection, we will assume the conditions of
Theorem 2 hold. Also, by Lemma 11, we additionally assume that $|Y_{a,a+1;i,j}| \le n^\delta$.

4.3.2. Useful tools and lemmas. We begin by denoting the Stieltjes transform of
$\nu_n(\cdot, z)$ by

\[ \Delta_n(\alpha, z) = \int \frac{\nu_n(du, z)}{u - \alpha}, \]

where $\alpha = x + iy$ with $y > 0$. We also note that $\Delta_n(\alpha, z) = \frac{1}{mn} \operatorname{Tr}(G)$, where
$G = (H_n - \alpha I)^{-1}$ is the resolvent matrix. For brevity, the variable $z$ will be
suppressed when there is no confusion and we will simply write $\Delta_n(\alpha)$.

We first present a number of lemmas that we will need to study $\Delta_n(\alpha)$. We
remind the reader that $R = Y - zI$ and $\alpha = x + iy$.

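Lemma 13. If $y > 0$ and $x \in K$ for some compact set $K$, then we have the
following bounds,

\[ \|Y\|^2 \le \max_{1 \le k \le m} \|X_k\|^2 \le \sum_{k=1}^{m} \|X_k\|^2, \tag{6} \]
\[ \|G\| \le \frac{1}{y}, \tag{7} \]
\[ \|RG\| \le C \sqrt{\frac{1}{y} + \frac{1}{y^2}}, \tag{8} \]
\[ \|GR^*\| \le C \sqrt{\frac{1}{y} + \frac{1}{y^2}}, \tag{9} \]

for some constant $C > 0$ which depends on $K$. Moreover, there exists a constant $C$
which depends only on $K$ such that

\[ \sup\{\|RG\| : x \in K,\ y \ge y_n,\ z \in \mathbb{C}\} \le C \sqrt{\frac{1}{y_n} + \frac{1}{y_n^2}}, \tag{10} \]
\[ \sup\{\|GR^*\| : x \in K,\ y \ge y_n,\ z \in \mathbb{C}\} \le C \sqrt{\frac{1}{y_n} + \frac{1}{y_n^2}}, \tag{11} \]

for any sequence $y_n > 0$.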

Proof. The first inequality in (6) follows from the definition of the norm and the
second inequality is trivial. The resolvent bound in (7) follows immediately because
$H_n$ is a Hermitian matrix.
To prove (8), we use polar decomposition to write $R = U|R|$ where $U$ is a partial
isometry and $|R| = \sqrt{R^*R}$. Then

\[ \|RG\| = \|U|R|(R^*R - \alpha)^{-1}\| \le \||R|(R^*R - \alpha)^{-1}\| \le \sup_{t \in \operatorname{Sp}(R^*R)} |\sqrt{t}\,(t - \alpha)^{-1}| \le \sup_{t \ge 0} |\sqrt{t}\,(t - \alpha)^{-1}| \le C \sqrt{\frac{1}{y} + \frac{1}{y^2}}. \]

A similar argument verifies (9). The bounds (10) and (11) follow from (8) and (9) by using that
$y \ge y_n$. □
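The final supremum bound in this proof can be verified by hand. The following short derivation is a sketch added for the reader's convenience; it writes $\alpha = x + iy$ with $y > 0$ and uses only $t \le |t - x| + |x|$ for $t \ge 0$ together with $2ab \le a^2 + b^2$:

```latex
\[
\frac{t}{|t-\alpha|^2}
  \le \frac{|t-x|}{(t-x)^2+y^2} + \frac{|x|}{(t-x)^2+y^2}
  \le \frac{1}{2y} + \frac{|x|}{y^2},
\qquad\text{so}\qquad
\sup_{t\ge 0}\left|\sqrt{t}\,(t-\alpha)^{-1}\right|
  \le \sqrt{\frac{1}{2y}+\frac{|x|}{y^2}} .
\]
```

Since $x$ ranges over a compact set, $|x|$ is bounded, which gives a bound of the form $C\sqrt{1/y + 1/y^2}$ with $C$ depending only on that set.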

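Lemma 14. We have that

\[ \mathbb{E}\left[ \frac{1}{n} \operatorname{Tr} G_{a,a} \right] = \mathbb{E}\left[ \frac{1}{mn} \operatorname{Tr} G \right] \]

for any $1 \le a \le m$.

Proof. Fix $1 \le a \le m$ and $1 \le i \le n$. We will show that

\[ \mathbb{E}\, G_{a,a;i,i} = \mathbb{E}\, G_{a+1,a+1;i,i}. \]

Using the adjoint formula for the inverse of a matrix, we can write that for any
$1 \le b \le m$

\[ G_{b,b;i,i} = \frac{\det (R^*R - \alpha I)^{(b,i)}}{\det (R^*R - \alpha I)}, \]

where $(R^*R - \alpha I)^{(b,i)}$ is the matrix $R^*R - \alpha I$ with the entries in the row and
column that contain the element $(R^*R - \alpha I)_{b,b;i,i}$ replaced by zeroes except for the
diagonal element which is replaced by a 1.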

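We will write $Q_b = X_b^* X_b + |z|^2 I - \alpha I$ and then note that $R^*R - \alpha I$ has the
form

\[ \begin{pmatrix} Q_m & -\bar z X_1 & & & -z X_m^* \\ -z X_1^* & Q_1 & -\bar z X_2 & & \\ & \ddots & \ddots & \ddots & \\ & & -z X_{m-2}^* & Q_{m-2} & -\bar z X_{m-1} \\ -\bar z X_m & & & -z X_{m-1}^* & Q_{m-1} \end{pmatrix}, \tag{12} \]

where $Q_m, Q_1, \ldots, Q_{m-1}$ appear along the diagonal.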

Let $\sigma = (1\ 2\ 3 \ldots m) \in S_m$. We now construct two bijective maps. Let $T_\sigma$ be the
map that takes matrices of the form (12) into the matrix where each occurrence of
$X_b$ is replaced by $X_{\sigma(b)}$ and each occurrence of $Q_b$ is replaced by $Q_{\sigma(b)}$. Also, let

\[ \Omega = \underbrace{\mathbb{C}^{n^2} \times \mathbb{C}^{n^2} \times \cdots \times \mathbb{C}^{n^2}}_{m \text{ times}} \]

denote the probability space. Then we write $\omega \in \Omega$ as $\omega = (X_1, X_2, \ldots, X_m)$. We
now define $T'_\sigma : \Omega \to \Omega$ by $T'_\sigma(X_1, \ldots, X_m) = (X_2, X_3, \ldots, X_m, X_1)$. Since
$X_1, \ldots, X_m$ are independent and identically distributed random matrices, $T'_\sigma$ is a
measure preserving map.

We claim that $\det(R^*R - \alpha I) = \det(T_\sigma(R^*R - \alpha I))$. Indeed, if $\lambda$ is an eigenvalue
of $(R^*R - \alpha I)$ with eigenvector $v = (v_m, v_1, \ldots, v_{m-1})^T$ where $v_b$ is an $n$-vector, then
a simple computation reveals that $w = (v_{\sigma(m)}, v_{\sigma(1)}, \ldots, v_{\sigma(m-1)})^T = (v_1, \ldots, v_m)^T$
is an eigenvector of $T_\sigma(R^*R - \alpha I)$ with eigenvalue $\lambda$.
Similarly, $\det (R^*R - \alpha I)^{(b,i)} = \det (T_\sigma((R^*R - \alpha I)^{(b,i)}))$. Define $f_{a,i}(\omega)$ to
be $\det (R^*R - \alpha I)^{(a,i)}(\omega)$ for each realization $\omega \in \Omega$. Then we have that

\[ f_{a+1,i}(\omega) = \det (R^*R - \alpha I)^{(a+1,i)}(\omega) \]
\[ = \det \big(T_\sigma((R^*R - \alpha I)^{(a+1,i)})\big)(\omega) \]
\[ = \det \big(T_\sigma(R^*R - \alpha I)\big)^{(a,i)}(\omega) \]
\[ = \det (R^*R - \alpha I)^{(a,i)}(T'_\sigma(\omega)) = f_{a,i}(T'_\sigma(\omega)) \]

and

\[ \det(R^*R - \alpha I)(\omega) = \det(R^*R - \alpha I)(T'_\sigma(\omega)). \]

Thus $G_{a,a;i,i}(T'_\sigma(\omega)) = G_{a+1,a+1;i,i}(\omega)$ for each $\omega \in \Omega$. Since $T'_\sigma$ is measure
preserving, the proof is complete. □

Next, we present the decoupling formula, which can be found, for example, in
[18]. If $\xi$ is a real-valued random variable such that $\mathbb{E}|\xi|^{p+2} < \infty$ and if $f(t)$ is a
complex-valued function of a real variable such that its first $p + 1$ derivatives are
continuous and bounded, then

\[ \mathbb{E}[\xi f(\xi)] = \sum_{a=0}^{p} \frac{\kappa_{a+1}}{a!} \mathbb{E}[f^{(a)}(\xi)] + \epsilon, \tag{13} \]

where $\kappa_a$ are the cumulants of $\xi$ and $|\epsilon| \le C \sup_t |f^{(p+1)}(t)|\, \mathbb{E}|\xi|^{p+2}$ where $C$ depends
only on $p$.

If $\xi$ is a Gaussian random variable with mean zero, then all the cumulants vanish
except for $\kappa_2$ and the decoupling formula reduces to the exact equation

\[ \mathbb{E}[\xi f(\xi)] = \mathbb{E}[\xi^2]\, \mathbb{E}[f'(\xi)]. \]
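As a quick worked instance of the Gaussian case (an illustration, not taken from the source), let $\xi$ be a centered real Gaussian with variance $\sigma^2$ and take $f(t) = \sin t$, whose derivatives are all bounded. Using the characteristic function $\mathbb{E}\, e^{it\xi} = e^{-\sigma^2 t^2/2}$ and differentiating in $t$ at $t = 1$:

```latex
\[
\mathbb{E}[\xi \sin \xi]
  = \operatorname{Im} \mathbb{E}\big[\xi e^{i\xi}\big]
  = \operatorname{Im}\Big( -i \,\frac{d}{dt}\, e^{-\sigma^2 t^2/2}\Big|_{t=1} \Big)
  = \sigma^2 e^{-\sigma^2/2},
\qquad
\mathbb{E}[\xi^2]\,\mathbb{E}[\cos \xi]
  = \sigma^2\, \operatorname{Re} \mathbb{E}\big[e^{i\xi}\big]
  = \sigma^2 e^{-\sigma^2/2}.
\]
```

Both sides agree, as the exact Gaussian identity predicts.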



Finally, to use (13), we need to compute the derivatives of the resolvent matrix $G$
with respect to the various entries of $Y$. This can be done by utilizing the resolvent
identity and we find

\[ \frac{\partial G_{a,b;k,l}}{\partial \operatorname{Re}(Y_{c,c+1;q,p})} = -(GR^*)_{a,c;k,q}\, G_{c+1,b;p,l} - G_{a,c+1;k,p}\, (RG)_{c,b;q,l}, \]
\[ \frac{\partial G_{a,b;k,l}}{\partial \operatorname{Im}(Y_{c,c+1;q,p})} = -i(GR^*)_{a,c;k,q}\, G_{c+1,b;p,l} + i G_{a,c+1;k,p}\, (RG)_{c,b;q,l}. \]


4.3.3. Main Theorem. For the results below, we will consider $\alpha = x + iy$ where
$y \ge y_n$ with $y_n = n^{-\eta\delta}$. Our goal is to establish the following result.

Theorem 15. Under the conditions of Theorem 2 and the additional assumption
that $|Y_{a,a+1;i,j}| \le n^\delta$, we have

\[ \Delta_n^3(\alpha, z) + 2\Delta_n^2(\alpha, z) + \frac{\alpha + 1 - |z|^2}{\alpha} \Delta_n(\alpha, z) + \frac{1}{\alpha} = r_n(\alpha, z), \]

where if $\delta$ is chosen such that $\delta\eta < 1/32$ and $\delta < 1/32$, then the remainder term
$r_n$ satisfies

\[ \sup\{|r_n(\alpha, z)| : |z| \le M,\ |x| \le N,\ y \ge y_n\} = O(\delta_n) \quad \text{a.s.} \]

with $\delta_n = n^{-1/4} y_n^{-5}$.
Remark 16. We note that the bounds presented here and in the rest of this section
are not optimal and can be improved. The bounds given, however, are sufficient
for our purposes.

In order to prove Theorem 15 we will need the following lemmas. The first
lemma is McDiarmid's Concentration Inequality [21].

Lemma 17 (McDiarmid's Concentration Inequality). Let $X = (X_1, X_2, \ldots, X_n)$
be a family of independent random variables with $X_k$ taking values in a set $A_k$ for
each $k$. Suppose that the real-valued function $f$ defined on $\prod A_k$ satisfies

\[ |f(x) - f(x')| \le c_k \]

whenever the vectors $x$ and $x'$ differ only in the $k$th coordinate. Let $\mu$ be the expected
value of the random variable $f(X)$. Then for any $t \ge 0$,

\[ P(|f(X) - \mu| \ge t) \le 2 e^{-2t^2 / \sum c_k^2}. \]

Remark 18. McDiarmid's Concentration Inequality also applies to complex-valued
functions by applying Lemma 17 to the real part and imaginary part separately.
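The right-hand side of Lemma 17 is easy to evaluate numerically. The sketch below computes it for a given list of bounded-difference constants; the function name `mcdiarmid_bound` is an assumption for this illustration.

```python
import math


def mcdiarmid_bound(t, c):
    """Right-hand side of McDiarmid's inequality: 2 * exp(-2 t^2 / sum_k c_k^2)."""
    total = sum(ck * ck for ck in c)
    return 2.0 * math.exp(-2.0 * t * t / total)
```

As a usage example under assumed constants: with $mn$ coordinates, each having $c_k = C/(n y_n^2)$, and threshold $t = n^{-1/4} y_n^{-2}$, the factors of $y_n$ cancel and the bound becomes $2\exp(-2\sqrt{n}/(mC^2))$, which is the $e^{-c\, n^{1/2}}$ decay that a union bound over polynomially many points can absorb.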

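Lemma 19. For $y \ge y_n$ and $|x| \le N$ (where $\alpha = x + iy$),

\[ P(|\Delta_n(\alpha, z) - \mathbb{E}\Delta_n(\alpha, z)| > t) \le 4 e^{-c t^2 n y_n^4} \tag{14} \]

for some absolute constant $c > 0$. Moreover,

\[ \sup\{|\Delta_n(\alpha, z) - \mathbb{E}\Delta_n(\alpha, z)| : |z| \le M,\ |x| \le N,\ y \ge y_n\} = O(n^{-1/4} y_n^{-2}) \quad \text{a.s.} \]

Proof. Let $R_k$ denote the matrix $R$ with the $k$-th column replaced by zeroes. Then
$R^*R$ and $R_k^* R_k$ differ by a matrix with rank at most two. So by the resolvent
identity

\[ \left| \frac{1}{mn} \operatorname{Tr}(R^*R - \alpha)^{-1} - \frac{1}{mn} \operatorname{Tr}(R_k^* R_k - \alpha)^{-1} \right| = \frac{1}{mn} \left| \operatorname{Tr}\big[ (R^*R - \alpha)^{-1} (R_k^* R_k - R^*R) (R_k^* R_k - \alpha)^{-1} \big] \right| \]
\[ \le \frac{C}{n y_n} \sup_{t \ge 0} |(t - \alpha)^{-1} t| \le \frac{C'}{n y_n^2}, \tag{15} \]

where the constant $C$ depends only on $N$. The $mn$ columns of $Y$ form an independent
family of random variables. We now apply Lemma 17 to the complex-valued
function $\frac{1}{mn} \operatorname{Tr}(R^*R - \alpha)^{-1}$ with the bound $c_k = O(n^{-1} y_n^{-2})$ obtained in (15).
This proves the bound (14). Thus, for any fixed point $(\alpha, z)$ in the region

\[ \{(\alpha = x + iy,\ z = s + it) : |x| \le N,\ y \ge y_n,\ |z| \le M\} \tag{16} \]

one has

\[ P\big(|\Delta_n(\alpha, z) - \mathbb{E}\Delta_n(\alpha, z)| > n^{-1/4} y_n^{-2}\big) \le 4 e^{-c n^{1/2}}, \tag{17} \]

where we recall that $y_n = n^{-\eta\delta}$ and $\delta > 0$ could be chosen to be arbitrarily small.
If $y = \operatorname{Im} \alpha > n^{1/4} y_n^2$, then

\[ |\Delta_n(\alpha, z)| \le \frac{1}{\operatorname{Im} \alpha} < n^{-1/4} y_n^{-2}, \qquad |\mathbb{E}\Delta_n(\alpha, z)| < n^{-1/4} y_n^{-2}. \tag{18} \]

Therefore, it is enough to bound the supremum of $|\Delta_n(\alpha) - \mathbb{E}\Delta_n(\alpha)|$ over the region

\[ \mathcal{D} = \{(\alpha = x + iy,\ z = s + it) : |x| \le N,\ y_n \le y \le n^{1/4} y_n^2,\ |z| \le M\}. \tag{19} \]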
To this end, we consider a finite $n^{-C}$-net of $\mathcal{D}$ where $C$ is some sufficiently large
positive constant to be chosen later. Clearly, one can construct such a net that
contains at most $[4M n^{4C} n^{1/4} y_n^2]$ points if $n$ is sufficiently large, where $[k]$ denotes
the integer part of $k$. Let us denote these points by $(\alpha_i, z_i)$, $1 \le i \le [4M n^{4C} n^{1/4} y_n^2]$.
It follows from (17) that one has

\[ P\Big( \sup_i |\Delta_n(\alpha_i, z_i) - \mathbb{E}\Delta_n(\alpha_i, z_i)| > n^{-1/4} y_n^{-2} \Big) \le 16 M y_n^2\, n^{4C + 1/4} e^{-c n^{1/2}}, \tag{20} \]

where the supremum is taken over the points of the net. Applying the Borel-Cantelli
lemma, we obtain that

\[ \sup_i |\Delta_n(\alpha_i, z_i) - \mathbb{E}\Delta_n(\alpha_i, z_i)| = O(n^{-1/4} y_n^{-2}) \quad \text{a.s.,} \tag{21} \]

where the supremum is again taken over the points of the $n^{-C}$-net of $\mathcal{D}$. To extend
the estimate (21) to the supremum over the whole region $\mathcal{D}$, we note that, for $(\alpha, z) \in \mathcal{D}$,

\[ \left| \frac{\partial \Delta_n(\alpha, z)}{\partial \operatorname{Re} \alpha} \right| \le \frac{1}{y_n^2}, \tag{22} \]
\[ \left| \frac{\partial \Delta_n(\alpha, z)}{\partial \operatorname{Im} \alpha} \right| \le \frac{1}{y_n^2}, \tag{23} \]
\[ \left| \frac{\partial \Delta_n(\alpha, z)}{\partial \operatorname{Re} z} \right| \le \mathrm{const}_m\, \frac{2(n^{1+\delta} + M)}{y_n^2}, \tag{24} \]
\[ \left| \frac{\partial \Delta_n(\alpha, z)}{\partial \operatorname{Im} z} \right| \le \mathrm{const}_m\, \frac{2(n^{1+\delta} + M)}{y_n^2}, \tag{25} \]

where $\mathrm{const}_m$ is a constant that depends only on $m$.

The bounds (22)-(23) are simple properties of the Stieltjes transform. Indeed, the
l.h.s. of (22) and (23) are bounded from above by $\frac{1}{|\operatorname{Im} \alpha|^2}$. The proof of (24)-(25)
follows from the resolvent identity

\[ (H_n(z_2) - \alpha I)^{-1} - (H_n(z_1) - \alpha I)^{-1} = -(H_n(z_1) - \alpha I)^{-1} (H_n(z_2) - H_n(z_1)) (H_n(z_2) - \alpha I)^{-1}, \]

the formula $H_n(z) = (Y^{(n)} - zI)^*(Y^{(n)} - zI)$, the bound $|z| \le M$, and the bound

\[ \|Y^{(n)}\| \le n^{1+\delta}. \tag{26} \]

We note that (26) follows from the fact that the matrix entries of $Y^{(n)}$ are bounded
by $n^\delta$.

Now, choosing $C$ in the construction of the net sufficiently large, one extends
the bound (21) to the whole region $\mathcal{D}$ by (22)-(25). This finishes the proof of the
lemma. □

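Lemma 20. For any $1 \le a \le m$,

\[ \sup\left\{ \operatorname{Var}\left( \frac{1}{n} \operatorname{Tr} G_{a,a} \right) : |x| \le N,\ y \ge y_n,\ z \in \mathbb{C} \right\} = O(n^{-1} y_n^{-4}), \]

where $\alpha = x + iy$.

Proof. Let $R_k$ denote the matrix $R$ with the $k$-th column replaced by zeroes and
let $P_a$ be the orthogonal projector such that $\operatorname{Tr} G_{a,a} = \operatorname{Tr}(P_a G P_a)$. Following the
same procedure as in the proof of Lemma 19 we have that

\[ \left| \operatorname{Tr}(R^*R - \alpha)^{-1}_{a,a} - \operatorname{Tr}(R_k^* R_k - \alpha)^{-1}_{a,a} \right| = \left| \operatorname{Tr}\big[ P_a (R_k^* R_k - \alpha)^{-1} (R_k^* R_k - R^*R) (R^*R - \alpha)^{-1} P_a \big] \right| \le \frac{C}{y_n^2}, \tag{27} \]

where the constant $C$ depends only on $N$.
We can write

\[ \frac{1}{n} \operatorname{Tr} G_{a,a} - \mathbb{E}\left[ \frac{1}{n} \operatorname{Tr} G_{a,a} \right] = \sum_{k=1}^{mn} \gamma_k, \]

where $\gamma_k$ is the martingale difference sequence

\[ \gamma_k = \mathbb{E}_k\left[ \frac{1}{n} \operatorname{Tr} G_{a,a} \right] - \mathbb{E}_{k-1}\left[ \frac{1}{n} \operatorname{Tr} G_{a,a} \right] \]

and $\mathbb{E}_k$ denotes the conditional expectation with respect to the elements in the first
$k$ columns of $Y$. Then by the bound in (27) and [6, Lemma 2.12], we have

\[ \mathbb{E}\left| \frac{1}{n} \operatorname{Tr} G_{a,a} - \mathbb{E}\frac{1}{n} \operatorname{Tr} G_{a,a} \right|^2 \le C \sum_{k=1}^{mn} \mathbb{E}|\gamma_k|^2 \le \frac{C}{n y_n^4}, \]

where the constant $C$ depends only on $N$. Since the bound holds for any $|x| \le N$,
$y \ge y_n$, and $z \in \mathbb{C}$ the proof is complete. □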

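Remark 21. By Lemmas 14, 19, and 20, for every $1 \le a, b, c \le m$,

\[ \mathbb{E}\left[ \frac{1}{n} \operatorname{Tr} G_{a,a} \right] = \frac{1}{mn} \operatorname{Tr} G + O(n^{-1/4} y_n^{-5}) \quad \text{a.s.,} \]
\[ \mathbb{E}\left[ \frac{1}{n} \operatorname{Tr} G_{a,a}\, \frac{1}{n} \operatorname{Tr} G_{b,b} \right] = \left( \frac{1}{mn} \operatorname{Tr} G \right)^2 + O(n^{-1/4} y_n^{-5}) \quad \text{a.s.,} \]

and

\[ \mathbb{E}\left[ \frac{1}{n} \operatorname{Tr} G_{a,a}\, \frac{1}{n} \operatorname{Tr} G_{b,b}\, \frac{1}{n} \operatorname{Tr} G_{c,c} \right] = \left( \frac{1}{mn} \operatorname{Tr} G \right)^3 + O(n^{-1/4} y_n^{-5}) \quad \text{a.s.,} \]

where the bounds hold uniformly in the region $|x| \le N$, $y \ge y_n$, and $|z| \le M$.

We are now ready to prove Theorem 15.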

Proof of Theorem 15. Fix $\alpha = x + iy$ with $|x| \le N$, $y \ge y_n$ and $z \in \mathbb{C}$ with $|z| \le M$.
We will show that the remainder term $r_n(\alpha, z) = O(\delta_n)$ a.s. where the constants in
the term $O(\delta_n)$ depend only on $N$ and $M$. In particular, the remainder term will
be estimated using Lemmas 13 and 19 and Remark 21 where the bounds all hold
uniformly in the region. In the proof presented below, we will use the notation $O_{N,M}(\cdot)$
to represent a term which is bounded uniformly in the region $|x| \le N$, $y \ge y_n$, and
$|z| \le M$.

By applying the resolvent identity to $G$ and replacing $R$ and $R^*$ with $Y - zI$
and $Y^* - \bar z I$, respectively, we obtain

\[ \frac{1}{n} \operatorname{Tr} G_{a,a} = -\frac{1}{\alpha} + \frac{1}{\alpha n} \operatorname{Tr}[GY^*Y]_{a,a} - \frac{z}{\alpha n} \operatorname{Tr}[GY^*]_{a,a} - \frac{\bar z}{\alpha n} \operatorname{Tr}[GY]_{a,a} + \frac{|z|^2}{\alpha n} \operatorname{Tr} G_{a,a}. \]

We will let $Y^{(r)}$ be the $(mn) \times (mn)$ matrix containing the real entries of $Y$ and
$Y^{(i)}$ be the $(mn) \times (mn)$ matrix containing the imaginary entries of $Y$ such that
$Y = Y^{(r)} + iY^{(i)}$. By assumption, $Y^{(r)}$ and $Y^{(i)}$ are independent random matrices.

Thus,

\[ \left( 1 - \frac{|z|^2}{\alpha} \right) \mathbb{E}\frac{1}{n} \operatorname{Tr} G_{a,a} + \frac{1}{\alpha} = \frac{1}{\alpha n} \mathbb{E}\operatorname{Tr}(GY^*Y^{(r)})_{a,a} + \frac{i}{\alpha n} \mathbb{E}\operatorname{Tr}(GY^*Y^{(i)})_{a,a} \]
\[ - \frac{z}{\alpha n} \mathbb{E}\operatorname{Tr}(GY^{(r)*})_{a,a} + \frac{iz}{\alpha n} \mathbb{E}\operatorname{Tr}(GY^{(i)*})_{a,a} \tag{28} \]
\[ - \frac{\bar z}{\alpha n} \mathbb{E}\operatorname{Tr}(GY^{(r)})_{a,a} - \frac{i\bar z}{\alpha n} \mathbb{E}\operatorname{Tr}(GY^{(i)})_{a,a}. \]

Let $\theta = \operatorname{Var}(\operatorname{Re}(\xi))$. Then $\operatorname{Var}(\operatorname{Im}(\xi)) = 1 - \theta$. To compute the expectation, we
fix all matrix entries except one and integrate with respect to that entry. Thus, by
applying the decoupling formula (13) with $p = 1$ and using the fact that $Y_{a,b;i,j} = 0$
whenever $b \ne a + 1$, we obtain the following expansions for the terms on the
right-hand side of (28).



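\[ \frac{1}{\alpha n} \mathbb{E}\operatorname{Tr}(GY^*Y^{(r)})_{a,a} = \frac{1}{\alpha n} \mathbb{E} \sum_{1 \le j,k,l \le n} G_{a,a;j,k}\, \bar Y_{a-1,a;l,k}\, \operatorname{Re}(Y_{a-1,a;l,j}) \]
\[ = \frac{\theta}{\alpha n} \mathbb{E}\operatorname{Tr} G_{a,a} - \frac{\theta}{\alpha n^2} \mathbb{E} \sum_{1 \le j,k,l \le n} \bar Y_{a-1,a;l,k} \big( (GR^*)_{a,a-1;j,l}\, G_{a,a;j,k} \big) - \frac{\theta}{\alpha n^2} \mathbb{E} \sum_{1 \le j,k,l \le n} \bar Y_{a-1,a;l,k} \big( G_{a,a;j,j}\, (RG)_{a-1,a;l,k} \big) + O_{N,M}\left( \frac{n^\delta}{n^{1/2} y_n^4} \right) \]
\[ = \frac{\theta}{\alpha n} \mathbb{E}\operatorname{Tr} G_{a,a} - \frac{\theta}{\alpha n^2} \mathbb{E}\big[ \operatorname{Tr} G_{a,a}\, \operatorname{Tr}(RGY^*)_{a-1,a-1} \big] + O_{N,M}\left( \frac{n^\delta}{n^{1/2} y_n^4} \right). \]

Here we use that the $\epsilon$ error term in (13) contains the second derivative

\[ \frac{\partial^2 (GY^*)_{a,a-1;j,l}}{\partial \operatorname{Re}(Y_{a-1,a;l,j})^2}, \]

which consists of several terms each bounded by Lemma 13. After summing over
$1 \le j, l \le n$ and utilizing the fact that the third moment of $\operatorname{Re}(Y_{a-1,a;l,j})$ is of
order $n^{\delta - 3/2}$, we obtain an error bound of $O_{N,M}(n^{\delta - 1/2} y_n^{-4})$. By following the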
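same procedure for the other terms, we obtain

\[ \frac{i}{\alpha n} \mathbb{E}\operatorname{Tr}(GY^*Y^{(i)})_{a,a} = \frac{i}{\alpha n} \mathbb{E} \sum_{1 \le j,k,l \le n} G_{a,a;j,k}\, \bar Y_{a-1,a;l,k}\, \operatorname{Im}(Y_{a-1,a;l,j}) \]
\[ = \frac{1 - \theta}{\alpha n} \mathbb{E}\operatorname{Tr} G_{a,a} - \frac{1 - \theta}{\alpha n^2} \mathbb{E}\big[ \operatorname{Tr} G_{a,a}\, \operatorname{Tr}(RGY^*)_{a-1,a-1} \big] + O_{N,M}\left( \frac{n^\delta}{n^{1/2} y_n^4} \right), \]

\[ \frac{z}{\alpha n} \mathbb{E}\operatorname{Tr}(GY^{(r)*})_{a,a} = \frac{z}{\alpha n} \mathbb{E} \sum_{1 \le j,k \le n} G_{a,a+1;j,k}\, \operatorname{Re}(Y_{a,a+1;j,k}) = -\frac{z\theta}{\alpha n^2} \mathbb{E}\big[ \operatorname{Tr} G_{a+1,a+1}\, \operatorname{Tr}(GR^*)_{a,a} \big] + O_{N,M}\left( \frac{n^\delta}{n^{1/2} y_n^4} \right), \]

\[ \frac{iz}{\alpha n} \mathbb{E}\operatorname{Tr}(GY^{(i)*})_{a,a} = \frac{iz}{\alpha n} \mathbb{E} \sum_{1 \le j,k \le n} G_{a,a+1;j,k}\, \operatorname{Im}(Y_{a,a+1;j,k}) = \frac{z(1 - \theta)}{\alpha n^2} \mathbb{E}\big[ \operatorname{Tr} G_{a+1,a+1}\, \operatorname{Tr}(GR^*)_{a,a} \big] + O_{N,M}\left( \frac{n^\delta}{n^{1/2} y_n^4} \right), \]

\[ \frac{\bar z}{\alpha n} \mathbb{E}\operatorname{Tr}(GY^{(r)})_{a,a} = \frac{\bar z}{\alpha n} \mathbb{E} \sum_{1 \le j,k \le n} G_{a,a-1;j,k}\, \operatorname{Re}(Y_{a-1,a;k,j}) = -\frac{\bar z\theta}{\alpha n^2} \mathbb{E}\big[ \operatorname{Tr} G_{a,a}\, \operatorname{Tr}(RG)_{a-1,a-1} \big] + O_{N,M}\left( \frac{n^\delta}{n^{1/2} y_n^4} \right), \]

and

\[ \frac{i\bar z}{\alpha n} \mathbb{E}\operatorname{Tr}(GY^{(i)})_{a,a} = \frac{i\bar z}{\alpha n} \mathbb{E} \sum_{1 \le j,k \le n} G_{a,a-1;j,k}\, \operatorname{Im}(Y_{a-1,a;k,j}) = -\frac{\bar z(1 - \theta)}{\alpha n^2} \mathbb{E}\big[ \operatorname{Tr} G_{a,a}\, \operatorname{Tr}(RG)_{a-1,a-1} \big] + O_{N,M}\left( \frac{n^\delta}{n^{1/2} y_n^4} \right). \]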

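Combining these terms yields

\[ \left( 1 - \frac{|z|^2}{\alpha} \right) \mathbb{E}\frac{1}{n} \operatorname{Tr} G_{a,a} = -\frac{1}{\alpha} + \frac{1}{\alpha n} \mathbb{E}\operatorname{Tr} G_{a,a} - \frac{1}{\alpha n^2} \mathbb{E}\big[ \operatorname{Tr} G_{a,a}\, \operatorname{Tr}(RGR^*)_{a-1,a-1} \big] + \frac{z}{\alpha n^2} \mathbb{E}\big[ \operatorname{Tr} G_{a+1,a+1}\, \operatorname{Tr}(GR^*)_{a,a} \big] + O_{N,M}\left( \frac{n^\delta}{n^{1/2} y_n^4} \right) \]
\[ = -\frac{1}{\alpha} + \frac{1}{\alpha n} \mathbb{E}\operatorname{Tr} G_{a,a} - \frac{1}{\alpha n^2} \mathbb{E}\big[ \operatorname{Tr} G_{a,a}\, \operatorname{Tr}(RGR^*)_{a-1,a-1} \big] - \frac{1}{\alpha n^2} \mathbb{E}\big[ \operatorname{Tr} G_{a+1,a+1}\, \operatorname{Tr}(GR^*R)_{a,a} \big] + \frac{1}{\alpha n^2} \mathbb{E}\big[ \operatorname{Tr} G_{a+1,a+1}\, \operatorname{Tr}(GR^*Y)_{a,a} \big] + O_{N,M}\left( \frac{n^\delta}{n^{1/2} y_n^4} \right), \]

where the second equality uses $zGR^* = GR^*Y - GR^*R$.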

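We note that by Remark 21 we have that

\[ -\frac{1}{\alpha n^2} \mathbb{E}\big[ \operatorname{Tr} G_{a+1,a+1}\, \operatorname{Tr}(GR^*R)_{a,a} \big] = -\frac{1}{\alpha n} \mathbb{E}\operatorname{Tr} G_{a+1,a+1} - \frac{1}{mn^2} \mathbb{E}\operatorname{Tr} G_{a,a}\, \mathbb{E}\operatorname{Tr} G + O_{N,M}(\delta_n) \]

and

\[ -\frac{1}{\alpha n^2} \mathbb{E}\big[ \operatorname{Tr} G_{a,a}\, \operatorname{Tr}(RGR^*)_{a-1,a-1} \big] = -\frac{1}{\alpha n^2} \mathbb{E}\big[ \operatorname{Tr} G_{a,a}\, \operatorname{Tr}(R^*RG)_{a,a} \big] + \frac{|z|^2}{\alpha n^2} \mathbb{E}\big[ \operatorname{Tr} G_{a,a} (\operatorname{Tr} G_{a,a} - \operatorname{Tr} G_{a-1,a-1}) \big] \]
\[ = -\frac{1}{\alpha n} \mathbb{E}\operatorname{Tr} G_{a,a} - \frac{1}{mn^2} \mathbb{E}\operatorname{Tr} G_{a,a}\, \mathbb{E}\operatorname{Tr} G + O_{N,M}(\delta_n). \]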

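Finally, we expand $Y$ in terms of $Y^{(r)}$ and $Y^{(i)}$ and again apply the decoupling
formula (13) to obtain

\[ \frac{1}{\alpha n^2} \mathbb{E}\big[ \operatorname{Tr} G_{a+1,a+1}\, \operatorname{Tr}(GR^*Y)_{a,a} \big] = -\frac{1}{n^3} \mathbb{E}\big[ \operatorname{Tr} G_{a+1,a+1}\, \operatorname{Tr} G_{a,a}\, \operatorname{Tr} G_{a,a} \big] + O_{N,M}(\delta_n) = -\frac{1}{n^3 m^2} (\mathbb{E}\operatorname{Tr} G)^2\, \mathbb{E}\operatorname{Tr} G_{a,a} + O_{N,M}(\delta_n), \]

where the last equality comes from Remark 21. Therefore, we have that

\[ \left( 1 - \frac{|z|^2}{\alpha} \right) \mathbb{E}\frac{1}{n} \operatorname{Tr} G_{a,a} = -\frac{1}{\alpha} - \frac{1}{\alpha n} \mathbb{E}\operatorname{Tr} G_{a+1,a+1} - \frac{2}{mn^2} \mathbb{E}\operatorname{Tr} G_{a,a}\, \mathbb{E}\operatorname{Tr} G - \frac{1}{n^3 m^2} (\mathbb{E}\operatorname{Tr} G)^2\, \mathbb{E}\operatorname{Tr} G_{a,a} + O_{N,M}(\delta_n). \]

By summing over $a$ and dividing by $m$, we obtain

\[ (\mathbb{E}\Delta_n(\alpha))^3 + 2(\mathbb{E}\Delta_n(\alpha))^2 + \frac{\alpha + 1 - |z|^2}{\alpha} \mathbb{E}\Delta_n(\alpha) + \frac{1}{\alpha} = O_{N,M}(\delta_n). \]

Thus, the proof is complete by Lemma 19. □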
Consider the cubic equation

\[ \Delta^3 + 2\Delta^2 + \frac{\alpha + 1 - |z|^2}{\alpha} \Delta + \frac{1}{\alpha} = 0, \tag{29} \]

where $\alpha = x + iy$. The solution of the equation has three analytic branches when
$\alpha \ne 0$ and when there is no multiple root. Below we show that the Stieltjes
transform of $\nu_n(\cdot, z)$ converges to a root of (29). Following the argument of Bai
and Silverstein in [6], we have that there is only one of the three analytic branches,
denoted by $\Delta(\alpha)$, to which the Stieltjes transforms converge. We let $m_2(\alpha)$
and $m_3(\alpha)$ denote the other two branches and note that $\Delta$, $m_2$, and $m_3$ are also
functions of $|z|$.
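A useful consistency check on the cubic, offered here as an illustration rather than as part of the original argument, is the case $z = 0$. There the left-hand side factors as $(\Delta + 1)(\Delta^2 + \Delta + 1/\alpha)$, and the quadratic factor is the familiar equation $\alpha s^2 + \alpha s + 1 = 0$ for the Stieltjes transform of the Marchenko-Pastur law with ratio 1, in agreement with the earlier observation about the singular values of $Y$. The sketch below evaluates the cubic and the Marchenko-Pastur root; the function names are assumptions for this example.

```python
import cmath


def cubic_residual(delta, alpha, z):
    """Left-hand side of the cubic equation evaluated at delta."""
    return (delta ** 3 + 2 * delta ** 2
            + (alpha + 1 - abs(z) ** 2) / alpha * delta + 1 / alpha)


def mp_stieltjes(alpha):
    """Root of alpha*s^2 + alpha*s + 1 = 0 with positive imaginary part (Im alpha > 0)."""
    root = cmath.sqrt(alpha * alpha - 4 * alpha)
    candidates = [(-alpha + root) / (2 * alpha), (-alpha - root) / (2 * alpha)]
    # The two roots sum to -1, so their imaginary parts have opposite signs.
    return max(candidates, key=lambda s: s.imag)
```

At $z = 0$ both `mp_stieltjes(alpha)` and the spurious root $\Delta = -1$ make `cubic_residual` vanish up to rounding; only the former has positive imaginary part, as a Stieltjes transform must when $\operatorname{Im} \alpha > 0$.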

By [6, Theorem B.9], there exists a distribution function $\nu(\cdot, z)$ such that

\[ \Delta(\alpha) = \int \frac{\nu(du, z)}{u - \alpha}. \]

Then we use the following lemmas due to Bai and Silverstein [6].

Lemma 22. The limiting distribution function $\nu(x, z)$ satisfies

\[ |\nu(w + u, z) - \nu(w, z)| \le \frac{2}{\pi} \max\{2\sqrt{3|u|},\ |u|\} \]

for all $z$. Also, the limiting distribution function $\nu(u, z)$ has support in the interval
$[x_1, x_2]$ when $|z| > 1$ and $[0, x_2]$ when $|z| \le 1$, where

\[ x_1 = \frac{1}{8|z|^2} \left[ -1 + 20|z|^2 + 8|z|^4 - \left( \sqrt{1 + 8|z|^2} \right)^3 \right], \]
\[ x_2 = \frac{1}{8|z|^2} \left[ \left( \sqrt{1 + 8|z|^2} \right)^3 - 1 + 20|z|^2 + 8|z|^4 \right]. \]

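Lemma 23. For any given constants $N > 0$, $A > 0$, and $\varepsilon \in (0,1)$ (recall that A
and $\varepsilon$ are used to define the region T), there exist positive constants $\varepsilon_1$ and $\varepsilon_0$ ($\varepsilon_0$
may depend on $\varepsilon_1$) such that for all large n,

(i) for $|\alpha| \le N$, $y \ge 0$, and $z \in T$,

\[ \max_{j=2,3} |\Delta(\alpha) - m_j(\alpha)| \ge \varepsilon_0, \]

(ii) for $|\alpha| \le N$, $y \ge 0$, $|\alpha - x_2| \ge \varepsilon_1$ (and $|\alpha - x_1| \ge \varepsilon_1$ if $|z| \ge 1 + \varepsilon$), and $z \in T$,

\[ \min_{j=2,3} |\Delta(\alpha) - m_j(\alpha)| \ge \varepsilon_0, \]

(iii) for $z \in T$ and $|\alpha - x_2| < \varepsilon_1$,

\[ \min_{j=2,3} |\Delta(\alpha) - m_j(\alpha)| \ge \varepsilon_0\sqrt{|\alpha - x_2|}, \]

(iv) for $|z| > 1 + \varepsilon$, $z \in T$, and $|\alpha - x_1| < \varepsilon_1$,

\[ \min_{j=2,3} |\Delta(\alpha) - m_j(\alpha)| \ge \varepsilon_0\sqrt{|\alpha - x_1|}. \]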



Remark 24. Lemma 23 shows that away from the real line, $\Delta$ is distinct from the
branches $m_2$ and $m_3$.



Lemma 25. We have

\[ \frac{\partial}{\partial s}\int_0^\infty \ln x\,\nu(dx,z) = g(s,t). \]

Remark 26. Lemma 25 shows that $\nu(\cdot,z)$ is the distribution which corresponds
to the circular law.

4.4. Rate of Convergence of $\nu_n(x,z)$. For this subsection, we return to the
original assumptions on the entries of Y. Before we prove Lemma 5 we need
to establish a rate of convergence of $\nu_n(x,z)$ to $\nu(x,z)$. We remind the reader
that v n (-,z) is the ESD of H n = (Y - zI)*(Y - zl) and v n (-,z) is the ESD of 
H n = (Y-zI)*(Y-zI). 

Lemma 27. For any M 2 > Mi > 0, 

sup \\v n (-,z) - v{-, z)\\ = sup \v n {x, z) - v(x, z)\ = 0(n~ s,l/8 ). 

M!<\z\<M 2 a;,Mi<|2|<M2 

Proof. We first note that it is enough to show 
(30) 



sup \\v n {-,z)-v(-,z)\\=0{^/y^). 

Mi<\z\<M 2 

Indeed, by Lemma [Til we have that 

L(v n (-,z),u(-,z)) < L(v n (-,z),v n {-,z)) + \\v n (;z) - v(-,z)\\ 
< \\V n {-,z)-v(-,z)\\+o{n-^l i ). 
and by Lemma 22,

\[ \|\nu_n(\cdot,z) - \nu(\cdot,z)\| \le C\sqrt{L(\nu_n(\cdot,z),\nu(\cdot,z))} \]

uniformly for $|z| \le M$.

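We now prove (30). Since $|\Delta_n(\alpha_0)| \le (\operatorname{Im}\alpha_0)^{-1}$ for any fixed $\alpha_0$ with $\operatorname{Im}\alpha_0 > 0$,
there exists a convergent subsequence of $\{\Delta_n(\alpha_0)\}_{n=1}^{\infty}$. Since $\Delta$ is the only branch
of (29) that defines a Stieltjes transform, the subsequence must converge to $\Delta(\alpha_0)$.
The same reasoning applies to any subsequence of the bounded sequence, so every subsequential limit equals $\Delta(\alpha_0)$.
Hence, $\Delta_n(\alpha_0) \to \Delta(\alpha_0)$ as $n \to \infty$ for any fixed $\alpha_0$ with $\operatorname{Im}\alpha_0 > 0$. Let $m_1 = \Delta$
and let $m_2$ and $m_3$ be the other two branches of the cubic equation (29).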

We remind the reader that T is a bounded set and that the supports of $\nu(\cdot,z)$ are
bounded for all $z \in T$. So by [BJ Corollary B.15] there exist N and some absolute
constant C such that

\\v n {-,z) - v{-,z)\\ 



<C\ I |A„(a) - A (a) I da; H sup/ \v(x + y, z) - u(x, z)\dy 

<j\<2Vn J 



< C 



[ f \A n (a) - A(a)\dx + — sup / 

\./|a:|<iV Vn x J\ y [ 

f / |A„(a)- A{a)\dx + ^yZ) 

V^|x|<JV / 



where $\alpha = x + iy_n$ and the last inequality follows from Lemma 22. So, to complete
the proof we only need to estimate the integral in the last inequality above.

We first show that for a = x + iy, |x| < N, |x— x 2 | > ei (|x — Xi| > ei if |z| < 1), 
V > 2M, Mi < |z| < M 2 , and all large 77, 

(31) |A„(a)-A(a)| < ^5 n 

where $\varepsilon_0$ and $\varepsilon_1$ come from Lemma 23 and C is a positive constant. By Theorem
15, consider a realization where

|A n (a) - A(a)||A„(a) - m 2 (a)||A„(a) - m 3 (a)| < C— e^„. 

for some positive constant C". Fix ao = xo + iyo with |xo| < N,yo > 0, and 
minfe^i^ |xo — Xfe| > ei. Fix z 6 T. Choose 77 large enough such that |A„(ao) — 
A(a )| < f • Then for fc G {1, 2}, 

e < |A(ao) - 777fc(ao)| < |A(a ) - A„(a )| + |A„(a ) - m k (a )\ 
and hence 

min |A„(a ) - 777fc(a )| > -g-- 

Thus, 

|A„(« )- A(a )| < C'e S n . 

Next we show (31) is true for all $y \ge y_n$, $|x| \le N$, and $\min_{k=1,2} |x - x_k| \ge \varepsilon_1$.
Suppose (31) is false. By continuity there exists a subsequence $n_l$, $z_l \in T$, and $\alpha_l$
with $|\operatorname{Re}(\alpha_l)| \le N$ and $\operatorname{Im}(\alpha_l) \ge y_n$, such that

|A„>,)-AK>l = ^pV 



Then 



|A n! (a0-A(a0|<| 
for all l greater than some L. By a similar argument as above and Lemma [
have 

min |A n ,(aj) - m fc (otj)| > ^ 



we 



for a\\l > L and hence 



\A ni (ai) - A(ai)\ < 



Ceo 



a contradiction. 

Finally, for the case where $|\alpha - x_k| < \varepsilon_1$ for k = 1 or 2, we apply a similar
argument and Lemma 23 to obtain



\A n (a) - A(a)\ = 



Xk\ 



= 0(W /2 )- 



□ 



4.5. Least Singular Value Bound. A key part of proving Lemma 5 is to control
the least singular value of $Y - zI$. Equivalently, we wish to obtain control of the
norm of the inverse $\|(Y - zI)^{-1}\|$.

We will obtain a bound using the results of Tao and Vu in [28]. We present
Tao and Vu's bound on the least singular value below, which only requires a finite
second moment assumption on the entries of the matrix.

Theorem 28 (Tao-Vu; Least singular value bound). Let $A, C_1$ be positive constants,
and let $\xi$ be a complex-valued random variable with non-zero finite variance
(in particular, the second moment is finite). Then there are positive constants B
and $C_2$ such that the following holds: if $N_n$ is the random matrix of order n whose
entries are i.i.d. copies of $\xi$, and M is a deterministic matrix of order n with
spectral norm at most $n^{C_1}$, then

(32) \[ \mathbb{P}\left(\|(M + N_n)^{-1}\| \ge n^B\right) \le C_2 n^{-A}. \]

Remark 29. We note that the bound in (32) is independent of the matrix M. In
particular, this bound holds for any deterministic matrix of order n with spectral
norm at most $n^{C_1}$.

We will prove an analogous version of Theorem 28 for the matrix Y. We first
need the following bounds for the norm of Y.

Lemma 30. We have the following bounds for the norm of Y:

\[ \|Y\| = O(n) \ \text{a.s.}, \qquad \mathbb{E}\|Y\| = O(n). \]

We also have that for any $1 \le a \le m$,

(33) \[ \mathbb{E}\|X_a\| = O(n). \]
Proof. We note that 

\[ Y^*Y = \begin{pmatrix} X_m^*X_m & & & \\ & X_1^*X_1 & & \\ & & \ddots & \\ & & & X_{m-1}^*X_{m-1} \end{pmatrix} \]
and hence the singular values of Y are the union of the singular values of $X_k$
for $1 \le k \le m$. Let $s_1, \ldots, s_{mn}$ denote the singular values of Y. Then



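\[ \frac{1}{mn}\|Y\| \le \frac{1}{mn}\sum_{i=1}^{mn} s_i \le \frac{1}{mn}\sum_{i=1}^{mn} s_i^2 + 1 \le \frac{1}{mn}\operatorname{Tr} Y^*Y + 1 = \frac{1}{mn}\sum_{k=1}^{m}\sum_{1\le i,j\le n} |(X_k)_{ij}|^2 + 1 \longrightarrow 2 \quad \text{a.s.} \]

as $n \to \infty$ by the law of large numbers. The same argument shows that

\[ \frac{1}{mn}\mathbb{E}\|Y\| \le \frac{1}{mn}\sum_{k=1}^{m}\sum_{1\le i,j\le n} \mathbb{E}|(X_k)_{ij}|^2 + 1 = 2. \]

A similar argument verifies (33).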



□ 
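
The block structure used in the proof of Lemma 30 can be checked on a small example. The sketch below is illustrative only. It assumes that Y is the block matrix with $X_k$ in block position $(k, k+1)$, indices taken cyclically, so that $X_m$ sits in the bottom-left corner; this layout is an assumption made here, since the definition (2) is not reproduced in this section. The helper names are hypothetical, and small integer entries are used so that the comparison is exact. Under this assumption $Y^*Y$ is block diagonal with diagonal blocks $X_k^*X_k$, and $\operatorname{Tr} Y^*Y$ equals the sum of the squared entries of all the $X_k$.

```python
import random


def transpose(A):
    return [list(row) for row in zip(*A)]


def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def cyclic_block_matrix(blocks):
    """Place blocks[k] in block position (k, (k + 1) mod m); zeros elsewhere."""
    m, n = len(blocks), len(blocks[0])
    Y = [[0] * (m * n) for _ in range(m * n)]
    for k, X in enumerate(blocks):
        col_block = (k + 1) % m
        for i in range(n):
            for j in range(n):
                Y[k * n + i][col_block * n + j] = X[i][j]
    return Y


def block(M, a, b, n):
    """Extract the n x n block in block position (a, b)."""
    return [row[b * n:(b + 1) * n] for row in M[a * n:(a + 1) * n]]


def random_blocks(m, n, seed=0):
    """m real integer-valued n x n matrices (a stand-in for X_1, ..., X_m)."""
    rng = random.Random(seed)
    return [[[rng.randint(-3, 3) for _ in range(n)] for _ in range(n)]
            for _ in range(m)]
```

With `Y = cyclic_block_matrix(blocks)` and `G = matmul(transpose(Y), Y)`, the diagonal block `block(G, a, a, n)` equals the Gram matrix of `blocks[(a - 1) % m]` and every off-diagonal block is zero, which is why the singular values of Y are the union of those of the individual blocks.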

Theorem 31 (Least singular value bound for Y). Let Y be the $(mn) \times (mn)$
matrix defined in (2) and let A be a positive constant. Then, under the hypothesis
of Theorem there exist positive constants B and C (depending on both A and
m) such that

\[ \mathbb{P}\left(\|(Y - zI)^{-1}\| \ge n^B\right) \le Cn^{-A} \]

uniformly for $|z| \le M_2$.

Proof. We remind the reader that $(Y - zI)^{-1}$ is an $(mn) \times (mn)$ matrix and again
refer to the $m^2$ blocks $(Y - zI)^{-1}_{a,b}$, each of size $n \times n$. A simple computation reveals
that (when invertible) $(Y - zI)^{-1}_{a,b}$ has the form

\[ z^{\kappa} X_{j_1}\cdots X_{j_l}\left(X_{i_1}\cdots X_{i_q} - z^r\right)^{-1}, \]

where $\kappa, l, q, r$ are nonnegative integers no bigger than m, the variables $\kappa, l, q, r, j_1, \ldots, j_l$
depend only on a and b, and the indices $i_1, \ldots, i_q$ are all distinct.
By the definition of the norm, we have that

\[ \|(Y - zI)^{-1}\| \le C_m \max_{1\le a,b\le m} \|(Y - zI)^{-1}_{a,b}\| \le C_m \sum_{1\le a,b\le m} \|(Y - zI)^{-1}_{a,b}\|, \]

where $C_m$ is a constant that depends only on m. Thus, it is enough to show that
given a positive constant A, there exist B and C such that

\[ \mathbb{P}\left(\left\|z^{\kappa} X_{j_1}\cdots X_{j_l}\left(X_{i_1}\cdots X_{i_q} - z^r\right)^{-1}\right\| \ge n^B\right) \le Cn^{-A} \]

uniformly for $|z| \le M_2$.
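So we have

\[ \mathbb{P}\left(\left\|z^{\kappa} X_{j_1}\cdots X_{j_l}\left(X_{i_1}\cdots X_{i_q} - z^r\right)^{-1}\right\| \ge n^B\right) \le m\,\mathbb{P}\left(\|X_1\| \ge n^{B/(m+2)}\right) + \mathbb{P}\left(\left\|\left(X_{i_1}\cdots X_{i_q} - z^r\right)^{-1}\right\| \ge n^{B/(m+2)}\right) \]

for $|z| \le M_2$ and n large. The first term can be estimated by Markov's inequality

\[ \mathbb{P}\left(\|X_1\| \ge n^{B/(m+2)}\right) \le \frac{\mathbb{E}\|X_1\|}{n^{B/(m+2)}} = O\left(n^{-B/(m+2)+1}\right) \]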
since $\mathbb{E}\|X_1\| = O(n)$ by Lemma 30. Therefore, this term is of order $n^{-A}$ by taking
$B \ge (m + 2)(A + 1)$. So, it is now enough to show that given a positive constant
A, there exist B and C such that

\[ \mathbb{P}\left(\left\|\left(X_{i_1}\cdots X_{i_q} - z^r\right)^{-1}\right\| \ge n^B\right) \le Cn^{-A}. \]

We note that

\[ \left(X_{i_1}\cdots X_{i_q} - z^r\right)^{-1} = X_{i_q}^{-1}\cdots X_{i_2}^{-1}\left(X_{i_1} - z^r X_{i_q}^{-1}\cdots X_{i_2}^{-1}\right)^{-1}. \]
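
The factorization $(X_{i_1}\cdots X_{i_q} - z^r)^{-1} = X_{i_q}^{-1}\cdots X_{i_2}^{-1}(X_{i_1} - z^r X_{i_q}^{-1}\cdots X_{i_2}^{-1})^{-1}$ follows from writing $X_{i_1}\cdots X_{i_q} - z^r = (X_{i_1} - z^r X_{i_q}^{-1}\cdots X_{i_2}^{-1})\,X_{i_2}\cdots X_{i_q}$ and inverting both sides. As a sanity check, the sketch below verifies it exactly for three $2 \times 2$ matrices with rational entries. This is an illustration only: the helper names are hypothetical, and a rational scalar `c` stands in for $z^r$ so that exact arithmetic can be used.

```python
from fractions import Fraction


def to_fraction(A):
    return [[Fraction(x) for x in row] for row in A]


def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_sub(A, B):
    return [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]


def mat_scale(c, A):
    return [[c * x for x in row] for row in A]


def mat_inv(A):
    """Inverse of an invertible 2 x 2 matrix via the adjugate formula."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]


def identity_sides(X1, X2, X3, c):
    """Return both sides of the factorization for q = 3 and scalar c."""
    X1, X2, X3 = to_fraction(X1), to_fraction(X2), to_fraction(X3)
    c = Fraction(c)
    eye = to_fraction([[1, 0], [0, 1]])
    lhs = mat_inv(mat_sub(mat_mul(mat_mul(X1, X2), X3), mat_scale(c, eye)))
    tail = mat_mul(mat_inv(X3), mat_inv(X2))  # X3^{-1} X2^{-1}
    rhs = mat_mul(tail, mat_inv(mat_sub(X1, mat_scale(c, tail))))
    return lhs, rhs
```

Calling `identity_sides([[1, 2], [3, 5]], [[2, 1], [1, 1]], [[1, 1], [0, 1]], 3)` returns two equal matrices. The point of the factorization in the proof is that the last factor has the form "deterministic shift plus one random matrix" once the other factors are conditioned on, which is the setting of Theorem 28.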
By Theorem 28 there exist positive constants B and C such that

(34) \[ \mathbb{P}\left(\|X_{i_q}^{-1}\cdots X_{i_2}^{-1}\| \ge n^B\right) \le m\,\mathbb{P}\left(\|X_1^{-1}\| \ge n^{B/m}\right) \le Cn^{-A}. \]

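Thus, we only need to show that given A there exist B and C such that

\[ \mathbb{P}\left(\left\|\left(X_{i_1} - z^r X_{i_q}^{-1}\cdots X_{i_2}^{-1}\right)^{-1}\right\| \ge n^B\right) \le Cn^{-A}. \]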

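We then have that

\[ \begin{aligned} \mathbb{P}\left(\left\|\left(X_{i_1} - z^r X_{i_q}^{-1}\cdots X_{i_2}^{-1}\right)^{-1}\right\| \ge n^B\right) &\le \mathbb{P}\left(\left\|\left(X_{i_1} - z^r X_{i_q}^{-1}\cdots X_{i_2}^{-1}\right)^{-1}\right\| \ge n^B \,\Big|\, \|X_{i_q}^{-1}\cdots X_{i_2}^{-1}\| \le n^{C_1}\right) \\ &\qquad \times \mathbb{P}\left(\|X_{i_q}^{-1}\cdots X_{i_2}^{-1}\| \le n^{C_1}\right) \\ &\quad + \mathbb{P}\left(\left\|\left(X_{i_1} - z^r X_{i_q}^{-1}\cdots X_{i_2}^{-1}\right)^{-1}\right\| \ge n^B \,\Big|\, \|X_{i_q}^{-1}\cdots X_{i_2}^{-1}\| > n^{C_1}\right) \\ &\qquad \times \mathbb{P}\left(\|X_{i_q}^{-1}\cdots X_{i_2}^{-1}\| > n^{C_1}\right) \\ &\le Cn^{-A}, \end{aligned} \]

where the first term is controlled by Theorem 28 (in particular, see Remark 29)
and the second term is estimated as in (34). This completes the proof of the
Theorem. □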

4.6. Proof of Lemma 5.

Proof of Lemma 5. In order to finish the proof of Lemma 5 we need to show that (5)
holds. By integration by parts, we have



(g n (s,t) -g(s,t))e 



isu+itv 



dtds 



iur(s,t)dtds + / (t(A, t) - t(-A, t))dt 
zeT J\t\<A 3 



\t\<l + c 



(r(V(l + e) 2 -t 2 , t) - r(-v/(l + e) 2 -i 2 , *)) dt 

1+6 V ' 

+ 1 (r(V(l " e) 2 - t*,t) - t(-v/(1 - ef - t\i)) dt 

J\t\<l-e V ' 



where 



r(s, t) = e 



ius-\-ivt 



hxx(y n (dx, z) — v(dx, z)). 
Let e„
bility 1, 



By Lemma [ 



.■OS/IS 



By Theorem [3T] and the Borel-Cantelli lemma, with proba- 



lim 



lim 



hixv n (dx, z) 



\n.xv{dx, z) 



dtds = 0. 



dtds = 0. 



By Lemma 30 there exists $\kappa > 0$ such that the support of $\nu_n(\cdot,z)$ lies in $[0, \kappa n^2]$
for all $z \in T$. Thus, by Lemma 27



lax(v n (dx, z) — v(dx, z)) 



dtds 



lnx(i/ ra (dx, z) — v(dx, z)) 



dtds 



< C [| ln(e„)| + ln(Kn 2 )] max \\v n {-, z) — v{-, z)\\ — !• a.s. 
Therefore, with probability 1,

\[ \lim_{n\to\infty} iu \iint_{z \in T} \tau(s,t)\,ds\,dt = 0. \]

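In a similar fashion, we can show that the boundary terms satisfy the following:

\[ \lim_{n\to\infty} \int_{|t| \le A^3} \tau(\pm A, t)\,dt = 0 \quad \text{a.s.,} \]

\[ \lim_{n\to\infty} \int_{|t| \le 1+\varepsilon} \tau\left(\pm\sqrt{(1+\varepsilon)^2 - t^2},\, t\right) dt = 0 \quad \text{a.s.,} \]

and

\[ \lim_{n\to\infty} \int_{|t| \le 1-\varepsilon} \tau\left(\pm\sqrt{(1-\varepsilon)^2 - t^2},\, t\right) dt = 0 \quad \text{a.s.} \]

The proof of Lemma 5 is complete.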



□ 



Remark 32. After we finished our paper, we learned about a very recent preprint
[16] where F. Götze and A. Tikhomirov proved the convergence of the expected
spectral distribution $\mathbb{E}\mu_X$ to the limit defined by (1) under the assumption that
the matrix entries are mutually independent centered complex random variables
with variance one. Our approach is different from the one used in [16]. We are
grateful to Z. Burda, T. Tao and A. Tikhomirov for useful comments regarding the
results of the paper. In addition, we are grateful to the anonymous referees for valuable
and constructive criticism regarding the proofs of Theorem 15 and Lemma 19, and
for bringing to our attention the reference [13] where a similar result was obtained
for m = 2.